A Comparative Study of Relationship
Between Mathematics and Science Achievement at the 8th Grade

Abstract 

Mathematics and science achievements have been assessed in the Third 
International Mathematics and Science Study (TIMSS) and its repetition (TIMSS-R). 
Meanwhile, the released TIMSS and TIMSS-R reports are largely divided into subject 
domains. To merge the research outcomes, this study is focused on an examination of the 
relationship between mathematics and science achievements. Moderate correlation 
coefficients have been found from the TIMSS and TIMSS-R data analyses. Different 
measurement scales are analyzed to articulate the correlation coefficients with student 
average scores in each subject. These empirical findings may help mathematics and
science educators assess the need for curriculum integration advocated by several
professional organizations in the U.S. and other nations.

Global market competition has been one of the driving forces toward
enhancement of educational accountability in many countries. As a result, more coherent
guidelines have been developed over the last decade to strengthen curriculum standards
in mathematics and science education. In the United States, professional organizations
produced documents to advocate curriculum articulation between mathematics and
science education (e.g., National Council of Teachers of Mathematics, 1998; National
Research Council, 1996). Meanwhile, educators in the United Kingdom adopted
interdisciplinary approaches in developing their national curriculum (Nixon, 1991).
The Curriculum Council of Western Australia (1998) also recommended teaching 
methods across subject boundaries (Venville, Wallace, Rennie, & Malone, 1998). 
Implementation of these new initiatives around the world ranges from thematic units to 
an entirely combined curriculum (Lonning, DeFranco, & Weinland, 1998). According to 
Haigh and Rehfeld (1995), “most of these attempts have been based upon the assumption 
that integration increases student achievement in both mathematics and science” (p. 241). 

In the late 1990s, large-scale databases were released from the Third
International Mathematics and Science Study (TIMSS) of 1995 and a repeat of the
TIMSS project (TIMSS-R) of 1999. Widely cited as an international benchmark, the
TIMSS and TIMSS-R projects incorporated both mathematics and science tests to assess
student academic performance (e.g., Martin & Mullis, 1996; Mullis et al., 2000). In this
study, correlation coefficients between the mathematics and science scores are analyzed
to assess the inter-subject relationship at the 8th grade using the TIMSS and TIMSS-R
databases.

Despite the persistent push for curriculum articulation in several nations, 
empirical evidence is yet to be established to support curriculum integration. In terms of 
the content structure, the relationship between mathematics and science could be 
asymmetric. “Unlike the mathematics teacher who can choose to avoid science, the 
science teacher is not able to cover most topics without calling on mathematical concepts 
and skills” (Frykholm & Meyer, 2002, p. 504). Furthermore, the reliance on mathematics 
varies across different science fields. Physics is a subject heavily dependent on 
mathematical preparation. However, the demand is not as strong in biology, and “other 
sciences such as psychology might not yet be ready for the kind of mathematization that 
has taken place in physics” (Orton & Roper, 2000, p. 124). 

On the other hand, whereas it was assumed that “integration would produce
greater learning outcomes of both mathematics and science, . . . few empirical attempts
have been made to test this assumption” (McBride & Silverman, 1991, p.
286). To date, the released TIMSS and TIMSS-R reports have been largely divided along
subject boundaries (e.g., Beaton et al., 1996a, b; Martin et al., 1998, 2001; Mullis et
al., 1998, 2001), and no correlation analyses have been conducted on student scores
between mathematics and science. In this regard, this investigation not only helps assess
the link between mathematics and science performance, but also enriches the comparative
research literature by adding more empirical findings across the subject boundaries.

Review of the Literature

Lederman and Niess (1998) observed, “the current reforms have resulted in 
renewed interest in curriculum integration, especially between mathematics and science” 
(p. 281). Despite the development of national standards in the United States and other
countries (e.g., NCTM, 1998; Nixon, 1991; NRC, 1996; Venville et al., 1998), no
interdisciplinary research has been conducted at the national or international levels to
analytically address two fundamental topics: (1) a system-wide assessment of the correlation
between mathematics and science achievements; and (2) an examination of the linkage
between a higher correlation and a higher average score in mathematics or science (see
review articles by Czerniak, Weber, Sandmann, & Ahern, 1999; Hurley, 2001; Pang &
Good, 2000).

Data Selection 

In the United States, the National Assessment of Educational Progress (NAEP)
has been one of the primary measures to assess the condition of education for more than
three decades. The NAEP methodology, such as spiral sampling, data imputation, and
plausible score construction, has been adapted in the international assessments (Gonzalez
& Smith, 1997; Pashley & Phillips, 1993). However, in the NAEP data, mathematics and
science scores were gathered from different student samples across the nation (Allen,
Carlson, & Zelenak, 1999). Thus, no students took the science and mathematics tests
concurrently, and no interdisciplinary analysis can be conducted using the NAEP
database.

In contrast, the TIMSS and TIMSS-R projects included both mathematics and science
tests at the 8th grade level. TIMSS researchers were quick to update their measurement
scales to maintain consistency in the student assessment. More specifically, the TIMSS-R
scale was developed from a new three-parameter model to replace the original
one-parameter model in TIMSS (Martin et al., 2001). Meanwhile, the TIMSS scores have been
rescaled in TIMSS-R to enhance result comparability between these two projects (Martin,
Gregory, & Stemler, 2000). In this study, the original and rescaled TIMSS scores are
analyzed to compare the impact of the scale adjustment on correlation coefficients between
mathematics and science achievements. Furthermore, the TIMSS and TIMSS-R data are
examined on the new scale to confirm the consistency of the research findings between the
two projects.

Statistical Computing 

Depending on the data scaling, several options are available for describing linear
correlations (SAS, 2001). Because student test scores are measured on an interval scale,
the Pearson correlation coefficient is an appropriate choice to assess the relation between
mathematics and science achievements (Ott, 1993):

\[ r = \frac{\mathrm{cov}(x_1, x_2)}{\sqrt{\mathrm{var}(x_1)\,\mathrm{var}(x_2)}} \qquad (1) \]
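The Pearson formula can be sketched in a few lines of Python. This is a minimal illustration under simple random sampling, with made-up scores for five students. It does not use TIMSS data, sampling weights, or plausible scores, and the function name `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: cov(x, y) / sqrt(var(x) * var(y))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and variances use the same (n - 1) divisor,
    # so the divisor cancels in the ratio.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical scores for five students (not TIMSS data).
math_scores = [480.0, 510.0, 530.0, 560.0, 600.0]
science_scores = [470.0, 520.0, 515.0, 570.0, 590.0]
r = pearson_r(math_scores, science_scores)
print(round(r, 3))
```

Because the same divisor appears in the covariance and in both variances, the choice between n and n - 1 does not change r.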

Formula (1) indicates the dependency of a correlation coefficient on estimates of the
variances [i.e., var(x_1), var(x_2)] and covariance [i.e., cov(x_1, x_2)]. For the stratified cluster
samples gathered in TIMSS/TIMSS-R, an assumption of simple random sampling may
lead to underestimation of variability in statistical inference (Martin, Gregory, & Stemler,
2000; Martin & Kelly, 1997). Kish (1965) introduced the concept of a design effect (deff) to
describe the variance estimation:

deff = (variance from complex sampling) / (variance from simple random sampling)
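As a small numerical illustration of the design effect, the sketch below computes deff from two variance estimates and converts it into an effective sample size (n divided by deff). The effective sample size is a common companion quantity that the text does not discuss. The variance values, the sample size, and the function names are all hypothetical.

```python
def design_effect(var_complex, var_srs):
    """deff = variance under the complex design / variance under SRS."""
    if var_srs <= 0:
        raise ValueError("var_srs must be positive")
    return var_complex / var_srs


def effective_sample_size(n, deff):
    """Size of a simple random sample that would give the same precision."""
    return n / deff


# Hypothetical variance estimates for a mean score (not TIMSS values).
deff = design_effect(var_complex=25.0, var_srs=5.0)
n_eff = effective_sample_size(n=4000, deff=deff)
print(deff, n_eff)
```

A deff above 1 means that treating the clustered sample as a simple random sample would understate the variance, which is the concern raised above.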
To avoid the underestimation of statistical variability, special software packages
other than SPSS are needed to account for the complex sampling structure (Cabrera, La
Nasa, & Burkum, 2002). One of the widely used software packages for survey data
analyses is the AM program developed by the American Institutes for Research (AIR). AIR
(2003) noted,

AM is a statistical software package for analyzing data from complex samples, 
especially large-scale assessments such as the National Assessment of 
Educational Progress (NAEP) and the Third International Mathematics and 
Science Studies (TIMSS). (http://am.air.org, p. 1)

In the released database, TIMSS researchers computed a total of five plausible
scores in each subject area to represent student achievement, and “one set of the imputed
plausible scores can be considered as good as another” (Gonzalez & Smith, 1997, ch. 6,
p. 3). The interchangeability of plausible scores suggests equivalency of the design effect
among the plausible scores. Under an assumption of invariant deff values between
mathematics and science scores, the AM software is employed to compute correlation
coefficients from the TIMSS and TIMSS-R data sets.

Research Questions 

If the world is viewed as a giant education laboratory, its diverse education
settings may have resulted in different correlation coefficients of student scores between
mathematics and science. The research questions that guide this investigation are:

1. Do student mathematics and science achievements from TIMSS/TIMSS-R
participating countries fit the linear relationship assumed by the Pearson correlation
analyses?

2. Is a higher correlation linked to a higher average science performance in the
international comparison?

3. What is the link between the score correlation and mathematics performance?

4. What are the consistent findings from the result triangulation across the TIMSS and
TIMSS-R projects?

Methods 

The TIMSS data and program files were downloaded from a public website
(http://www.timss.org). The average mathematics and science scores have been replicated
to show an exact match with the existing results from TIMSS/TIMSS-R reports (Beaton
et al., 1996a, b; Martin et al., 2000; Mullis et al., 2000). On the basis of the scores at the
country level, plots are created to examine the pattern of linearity in the score
relationship (Question 1). Fisher’s (1921) z transformation is applied within each nation to
compute an average correlation coefficient among plausible scores in mathematics and
science. According to Corey, Dunlap, and Burke (1998), "When correlations come from a
matrix, there is a consistent advantage associated with using [Fisher’s] z'. Across sample
size and numbers of correlations averaged, bias in average r(z)' is smaller than bias in
average r" (p. 260).
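The averaging step can be sketched as follows: each correlation is mapped to z = atanh(r), the z values are averaged, and the mean is mapped back with tanh. The five input correlations are hypothetical, and how the plausible scores in the two subjects were paired in the actual analysis is not stated here, so the list is only a stand-in for whatever set of correlations is being averaged.

```python
import math


def average_correlation(r_values):
    """Average correlations via Fisher's z: r_bar = tanh(mean(atanh(r)))."""
    if not r_values:
        raise ValueError("r_values must not be empty")
    if any(abs(r) >= 1.0 for r in r_values):
        raise ValueError("each r must lie strictly between -1 and 1")
    z_values = [math.atanh(r) for r in r_values]
    mean_z = sum(z_values) / len(z_values)
    return math.tanh(mean_z)


# Hypothetical correlations, one per pairing of plausible scores.
plausible_rs = [0.58, 0.61, 0.60, 0.63, 0.59]
r_bar = average_correlation(plausible_rs)
print(round(r_bar, 4))
```

When the input correlations are close to one another, as in this example, the result differs little from the plain arithmetic mean. The two diverge more as the correlations spread out or approach 1.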

To facilitate empirical comparisons of different education systems, correlation
analyses are conducted at the country level to examine whether higher correlation coefficients
are linked to higher average scores in mathematics or science (Questions 2 & 3). Patterns
of the data distribution are plotted at the country level to examine the consistency of research
findings between the TIMSS and TIMSS-R databases on the old and new scales (Question
4). Results of the correlation analysis can be articulated with TIMSS/TIMSS-R scores to
examine the link between the score correlation and student achievement in mathematics
or science.

Results 

Mathematics and science scores in all participating countries have been plotted on 
three scales: (1) TIMSS results on the TIMSS original scale; (2) TIMSS results on the 
new TIMSS-R three-parameter scale; and (3) TIMSS-R results on the new scale (see 
Figures 1-3). The average correlation coefficients from Fisher’s z transformation are
listed in Table 1 for each nation. Because student scores represent the achieved curriculum
in an education system (Linn, 2000; Zabulionis, 2001), the correlation coefficient
indicates the relationship between the achieved curricula in mathematics and science. The
correlation coefficient has been plotted against academic performance in mathematics
(Figures 4-6) and science (Figures 7-9). Correlation coefficients have also been computed to
describe the relationship between this curriculum link (r) and student scores on both the new and
old scales (Table 2).

Discussions 

Figures 1-3 indicate a linear pattern in the relationship between mathematics and
science achievements among the TIMSS/TIMSS-R participating nations. Therefore,
Pearson r is an appropriate statistic for describing the relationship between the achieved curricula
in mathematics and science. The correlation coefficients show a fair amount of
variability, ranging from Kuwait’s 0.39 on the TIMSS original scale (country id: 414) to
0.89 for South Africa’s average scores rescaled in TIMSS-R (country id: 717) (Table 1).
For countries with a performance score below 450 on the TIMSS scale, the range of
correlation coefficients is wider than that among top-performing countries (Figures 4-9).
In other words, score correlations that are very strong or very weak are typically
linked to poor performance in mathematics or science. This consistent pattern between
TIMSS and TIMSS-R seems to support a moderate level of integration between
mathematics and science (e.g., 0.44<r<0.63 on the TIMSS original scale; see Figures 4
& 7). As a result, perhaps a more balanced position should be taken by school
professionals to avoid overemphasis or de-emphasis of integration between
mathematics and science education.

In addition, without considering the rescaling of the TIMSS data, the correlation
of student achievements appears to be positively associated with average national
performance in mathematics and science (Table 2). However, Figures 4, 5, 7 and 8 show
no clear upward or downward pattern in the TIMSS data, which is consistent with the
nonsignificant correlations on the original and new scales (Table 2). For the negative
correlation from the rescaled TIMSS data (Table 2), Figures 5 and 8 indicate that South
Africa’s result could be an outlier with low performance scores and a high correlation
coefficient between mathematics and science achievements (r=0.89). Had this
observation been taken out, the correlation coefficients on the new scale would be 0.19
and 0.14, instead of -0.15 and -0.25, respectively (Table 2). Therefore, the negative
correlation might reflect a statistical artifact, and the TIMSS scale transformation did not
clearly result in a significant correlation from the 1995 database.
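The sensitivity of a country-level correlation to a single observation can be shown with a toy data set. The sketch below uses entirely hypothetical pairs of average score and math-science correlation, not the TIMSS country values. One added point with a low score and a high correlation is enough to turn a positive association negative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical country-level data: (average score, math-science correlation).
countries = [(450, 0.60), (480, 0.62), (500, 0.65),
             (520, 0.66), (550, 0.70), (600, 0.72)]
outlier = (300, 0.89)  # low average score, very high correlation

scores, rs = zip(*countries)
r_without = pearson_r(scores, rs)

scores_all, rs_all = zip(*(countries + [outlier]))
r_with = pearson_r(scores_all, rs_all)

print(r_without > 0, r_with < 0)
```

With only a few dozen countries in the analysis, a check of this kind, recomputing the coefficient with and without the suspect observation, is a reasonable way to see how much a single point drives the result.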

On the other hand, the TIMSS-R results from 1999 show a significant correlation
between the Pearson r value and student achievement in mathematics and science (Table 2).
It should be noted that only “Twenty-six countries took part in the TIMSS eighth-grade
assessments in both 1995 and 1999” (Mullis et al., 2000, p. 34). More than a dozen
countries participated in only one of the international studies. Given the difference in
participating nations, the correlation results could have been affected by the involvement
of different countries between TIMSS and TIMSS-R.
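The significance pattern reported in Table 2 can be checked approximately with the usual t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). This sketch rests on two assumptions that the text does not confirm. First, n is taken as the number of non-missing entries per column in Table 1 (41 for r_95, 38 for r_99). Second, the test is treated as a simple two-sided test at the country level. The critical value of about 2.03 is the two-sided 0.05 cutoff of Student's t for degrees of freedom in the upper 30s.

```python
import math


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Correlations from Table 2 (mathematics row). The sample sizes are an
# assumption: the count of non-missing entries per column in Table 1.
t_95 = t_statistic(0.24960, 41)  # TIMSS, original scale
t_99 = t_statistic(0.46436, 38)  # TIMSS-R

# Approximate two-sided 0.05 critical value for 36 to 39 degrees of freedom.
CRITICAL_T = 2.03
significant_95 = abs(t_95) > CRITICAL_T
significant_99 = abs(t_99) > CRITICAL_T
print(significant_95, significant_99)
```

Under these assumptions, the 1995 coefficient falls short of the cutoff and the 1999 coefficient exceeds it, which agrees with the asterisks in Table 2.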

Inspection of Table 1 further reveals a gap in the correlation results between the
original one-parameter scale and the new three-parameter scale (i.e., r_95 < r_new95).
On the same new scale, the TIMSS and TIMSS-R results show a stronger agreement
between the correlation coefficients (r_new95 & r_99).
In part, this is because the three-parameter scale has taken the guessing effect into
consideration (Hambleton & Swaminathan, 1985). Hambleton (1988) noted, “with
difficult multiple-choice tests, a researcher might anticipate considerable guessing on the
part of examinees. Needed, therefore, would be a model that could handle this situation”
(p. 154).
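To make the role of the guessing parameter concrete, the sketch below evaluates a generic logistic item response function with and without a lower asymptote. It uses the textbook three-parameter form without a scaling constant and is not a reproduction of the TIMSS scaling procedure. The ability, difficulty, and discrimination values are hypothetical.

```python
import math


def prob_correct(theta, b, a=1.0, c=0.0):
    """Logistic item response function.

    theta: examinee ability; b: item difficulty;
    a: item discrimination; c: lower asymptote (guessing).
    With a=1 and c=0 this reduces to the one-parameter form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: difficulty 1.0, four options, so c is set to 0.25.
low_ability = -3.0
p_one_param = prob_correct(low_ability, b=1.0)
p_three_param = prob_correct(low_ability, b=1.0, a=1.2, c=0.25)
print(round(p_one_param, 3), round(p_three_param, 3))
```

For a low-ability examinee, the one-parameter curve predicts a success probability near zero. The three-parameter curve cannot fall below c, which matches the chance rate on a four-option item.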

Because more than 90% of TIMSS items are in a multiple-choice format (Lange,
1997), the result seems to support the effort of TIMSS researchers to rescale the TIMSS
results and thus properly consider the potential impact from guessing (Martin et al.,
2000; Mullis et al., 2000). For instance, the TIMSS instrument has the following
mathematics item in a multiple-choice format:



02. If the price of a can of beans is raised from 60 cents to 75 cents, what is the
    percent increase in the price?

    A. 15%
    B. 20%
    C. 25%
    D. 30%



Computing a percent increase is a basic mathematical skill required in many scientific
experiments. With the four options in this question, the probability of obtaining a correct
answer through random guessing is 25%. In the TIMSS data, only 28% of 8th graders from
all participating nations answered this question correctly! This result not only illustrates
a need for correcting the guessing effect, but also urges educators to make a concerted
effort to improve student performance in this joint area between mathematics and
science. In summary, this empirical data analysis seems to suggest that the call for
subject articulation is still a valid accountability issue following the long-lasting quest for
educational improvement.
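The arithmetic behind the sample item can be written out as a short sketch. The percent increase from 60 to 75 cents is 25% (option C), and the chance rate on a four-option item is also 25%. The last step applies the classical correction for guessing, which assumes that every student who does not know the answer guesses uniformly at random. That assumption and the helper names are illustrative additions rather than part of the TIMSS analysis.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


def chance_rate(num_options):
    """Probability of a correct answer by uniform random guessing."""
    return 1.0 / num_options


increase = percent_increase(60, 75)  # price in cents
guess = chance_rate(4)               # options A to D
observed = 0.28                      # share correct reported in the text

# Classical correction for guessing: share of students who know the answer,
# assuming all others guess uniformly at random (an assumption).
knowing_share = (observed - guess) / (1.0 - guess)
print(increase, guess, round(knowing_share, 2))
```

Under that assumption, an observed 28% correct corresponds to roughly 4% of students actually knowing the answer, which shows how close the observed rate is to pure chance.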
Table 1

Correlation coefficients between mathematics and science scores

country_id    r_95       r_new95    r_99
36            0.61066    0.72561    0.73445
40            0.60400    0.74650    .
56            0.53763    0.72533    .
57            0.56130    0.72329    .
100           .          .          0.70826
124           0.50309    0.60844    0.65022
152           .          .          0.67055
158           .          .          0.79888
170           0.46660    0.70264    .
196           0.60592    0.72942    0.71988
200           0.55550    0.71477    .
201           0.59682    0.72885    .
203           .          .          0.68582
208           0.54204    0.67358    .
246           .          .          0.63206
250           0.44642    0.59251    .
280           0.60900    0.76270    .
300           0.57560    0.71952    .
344           0.55414    0.72474    0.69972
348           0.58429    0.70712    0.71063
352           0.52166    0.69369    .
360           .          .          0.69388
364           0.43267    0.62444    0.66955
372           0.59635    0.73103    .
376           0.60736    0.71732    0.77825
380           .          .          0.76616
392           0.55408    0.68190    0.72506
400           .          .          0.75202
410           0.57265    0.72091    0.73592
414           0.38991    0.55434    .
428           0.50811    0.64808    0.67064
440           0.54685    0.65489    0.73885
458           .          .          0.72142
498           .          .          0.68617
504           .          .          0.43424
528           0.56049    0.75407    0.69678
554           0.58807    0.72179    0.76637
578           0.54790    0.66924    .
608           0.62747    0.79023    0.66908
620           0.45528    0.64280    .
642           0.60999    0.71681    0.70437
643           0.56386    0.67507    0.71900
702           0.49319    0.68224    0.78195
703           .          .          0.73010
705           .          .          0.70417
710           .          .          0.67490
717           0.54361    0.89345    .
724           0.48891    0.64371    .
752           0.54953    0.69207    .
756           0.55418    0.73575    .
764           0.51430    0.61442    0.71423
788           .          .          0.54241
792           .          .          0.65385
807           .          .          0.71579
826           0.60866    0.72191    .
827           0.58280    0.71281    .
840           0.61213    0.74919    0.77763
890           0.55386    0.70312    .
926           .          .          0.76603
956           .          .          0.67450



Notes: (1) The country ID follows the specification of the TIMSS codebook.

(2) For countries that did not participate in both TIMSS and TIMSS-R, the sign "."
is the default for missing observations.

(3) r_95: Correlation coefficients from the original TIMSS scale;
r_new95: TIMSS correlation coefficients on the new TIMSS-R scale;
r_99: Correlation coefficients from the TIMSS-R database.
Table 2

Correlation between student performance and the indicator of math-science link

                           r_95       r_new95     r_99
Mathematics achievement    0.24960    -0.15469    0.46436 ***
Science achievement        0.25681    -0.25216    0.51684 ***

Notes: (1) The math-science link in the achieved curriculum is described by the
correlation between mathematics and science achievements.

(2) *** indicates that the correlation is significant at the 0.05 level.

(3) r_95: Correlation coefficients from the original TIMSS scale;
r_new95: TIMSS correlation coefficients on the new TIMSS-R scale;
r_99: Correlation coefficients from the TIMSS-R database.



Figure 1: Plot of science and mathematics scores

Figure 2: Plot of science and mathematics scores (TIMSS Results on the TIMSS-R Scale)

Figure 3: Plot of mathematics and science scores

Figure 4: Plot of correlation coefficients and math scores (TIMSS Results on the Original Scale)

Figure 5: Plot of correlation coefficients and math scores (TIMSS Results on the TIMSS-R Scale)

Figure 6: Plot of correlation coefficients and math scores (TIMSS-R Results; r_99: correlation coefficient; m: average mathematics score)

Figure 7: Plot of correlation coefficients and science scores (TIMSS Results on the Original Scale)

Figure 8: Plot of correlation coefficients and science scores (TIMSS Results on the TIMSS-R Scale)

Figure 9: Plot of correlation coefficients and science scores (TIMSS-R Results)